Intl: Add a new IntlNumberRangeFormatter class #19232

Open: 9 commits to merge into base: master

BogdanUngureanu
Contributor

closes #18599

Adds support for ICU's NumberRangeFormatter, which returns a locale-aware range string for two given numbers. Instead of the C API, I've opted for the C++ one because it leaves more room for future extension (it could be combined with a new NumberFormatter ICU class), while the C version only supports skeleton strings.

Since the new NumberFormatter class hasn't been added to PHP Intl yet, the implementation uses skeletons for now.

The implementation exposes a factory method that returns a new IntlNumberRangeFormatter object, while the constructor is private. I went with this approach because it allows the class to gain a second factory method later for a different formatting source (e.g. NumberFormatter). Something like:

createFromSkeleton(...)
createFromNumberFormatter()
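
To show the private-constructor-plus-named-factories shape in isolation, here is a minimal C++ sketch. `RangeFormatterSketch` and all of its members are hypothetical names chosen for illustration, not the PR's code, and the strings only stand in for real ICU configuration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the proposed class; not the PR's actual code.
class RangeFormatterSketch {
public:
    // Named factory for the skeleton-based configuration.
    static RangeFormatterSketch createFromSkeleton(std::string skeleton, std::string locale) {
        return RangeFormatterSketch("skeleton:" + skeleton, std::move(locale));
    }

    // A second factory can be added later without touching the constructor's visibility.
    static RangeFormatterSketch createFromNumberFormatter(std::string formatterName, std::string locale) {
        return RangeFormatterSketch("formatter:" + formatterName, std::move(locale));
    }

    const std::string& source() const { return source_; }
    const std::string& locale() const { return locale_; }

private:
    // Private constructor: callers must go through one of the named factories.
    RangeFormatterSketch(std::string source, std::string locale)
        : source_(std::move(source)), locale_(std::move(locale)) {}

    std::string source_;
    std::string locale_;
};

// Usage: the factory name documents which configuration source was used.
void usageExample() {
    RangeFormatterSketch f = RangeFormatterSketch::createFromSkeleton("currency/USD", "en-US");
    assert(f.source() == "skeleton:currency/USD");
    assert(f.locale() == "en-US");
}
```

The point of the design is that each factory name states how the formatter was configured, so a new configuration source becomes a new method rather than an overloaded constructor.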

Drawbacks of the current PHP API:
The C++ version allows configuring the NumberFormatter/skeleton for $start and $end individually (using a fluent interface). The PHP API I propose applies a single configuration to both numbers.
C++ example code:

NumberRangeFormatter::with()
    .identityFallback(UNUM_IDENTITY_FALLBACK_APPROXIMATELY_OR_SINGLE_VALUE)
    .numberFormatterFirst(NumberFormatter::with().adoptUnit(MeasureUnit::createMeter()))
    .numberFormatterSecond(NumberFormatter::with().adoptUnit(MeasureUnit::createKilometer()))
    .locale("en-GB")
    .formatFormattableRange(750, 1.2, status)
    .toString(status);
// => "750 m - 1.2 km"

and a C example

// Setup:
UErrorCode ec = U_ZERO_ERROR;
UNumberRangeFormatter* uformatter = unumrf_openForSkeletonCollapseIdentityFallbackAndLocaleWithError(
    u"currency/USD precision-integer",
    -1,
    UNUM_RANGE_COLLAPSE_AUTO,
    UNUM_IDENTITY_FALLBACK_APPROXIMATELY,
    "en-US",
    NULL,
    &ec);
UFormattedNumberRange* uresult = unumrf_openResult(&ec);
if (U_FAILURE(ec)) { return; }

// Format a double range:
unumrf_formatDoubleRange(uformatter, 3.0, 5.0, uresult, &ec);
if (U_FAILURE(ec)) { return; }

// Get the result string:
int32_t len;
const UChar* str = ufmtval_getString(unumrf_resultAsValue(uresult, &ec), &len, &ec);
if (U_FAILURE(ec)) { return; }
// str should equal "$3 – $5"

While the code works, it is still a work in progress: I still have to implement proper error handling. In the meantime, I would like to hear your opinions about the PHP interface I'm proposing.


private function __construct() {}

public static function createFromSkeleton(string $skeleton, string $locale, int $collapse, int $identityFallback): IntlNumberRangeFormatter {}
Member

@kocsismate kocsismate Jul 25, 2025

I hope I won't upset you too much, but I think these changes will need an RFC, because any API changes involve a lot of bikeshedding lately.

For example, we have started to use enums more often (see my https://wiki.php.net/rfc/url_parsing_api RFC), so I think this practice could be continued in your case too.

Member

Hm, I see that you have recently added the ListFormatter class in #18519 that also has some enum-like constants. It probably also makes sense to stay consistent with the current convention of intl (class constants), so I don't insist on my above suggestion.

Member

If we decide that we should employ nicer features in intl, we should do a thorough review of not only the constants, but other mechanisms (like error handling) as well. And then we can make one compelling holistic RFC (if we would desire to do so ;) ).

Contributor Author

> Hm, I see that you have recently added the ListFormatter class in #18519 that also has some enum-like constants. It probably also makes sense to stay consistent with the current convention of intl (class constants), so I don't insist on my above suggestion.

Yeah, that was my reasoning. Since that one didn't need an RFC, I thought this one wouldn't need one either.

> If we decide that we should employ nicer features in intl, we should do a thorough review of not only the constants, but other mechanisms (like error handling) as well. And then we can make one compelling holistic RFC (if we would desire to do so ;) ).

Sounds fair. I think the API I'm proposing is OK for now. I do agree that introducing the new NumberFormatter will require an RFC.

@BogdanUngureanu BogdanUngureanu changed the title WIP - Intl: Add a new IntlNumberRangeFormatter class Intl: Add a new IntlNumberRangeFormatter class Aug 2, 2025
@BogdanUngureanu BogdanUngureanu force-pushed the ext-intl-add-number-range-formatter branch from d55c59b to 0e083db Compare August 2, 2025 22:35
Contributor Author

BogdanUngureanu commented Aug 3, 2025

The test failure for [Push / WINDOWS_X64_ZTS (pull_request)](https://github.com/php/php-src/actions/runs/16698635741/job/47266273580?pr=19232) (failing after 31m) seems to be a false positive? It doesn't fail on the other jobs.

@devnexen @kocsismate when you have some spare time, can you look at the PR again? Thanks.

@BogdanUngureanu BogdanUngureanu force-pushed the ext-intl-add-number-range-formatter branch from f598e0e to 9e6cd5c Compare August 7, 2025 22:32
@BogdanUngureanu BogdanUngureanu force-pushed the ext-intl-add-number-range-formatter branch from 9e6cd5c to 9e7da4e Compare August 7, 2025 22:33
Comment on lines +158 to +159
intl_error_set(NULL, error, "Failed to format number range");
zend_throw_exception(IntlException_ce_ptr, "Failed to format number range", 0);
Member

intl_error_set may already throw an exception, or emit a warning, or do nothing depending on the error mode selected.

If you decide to always throw, then just throw an exception directly rather than going through its internal error mechanism.
Or you need to do what I did for constructors:

	const bool old_use_exception = INTL_G(use_exceptions);
	const zend_long old_error_level = INTL_G(error_level);
	INTL_G(use_exceptions) = true;
	INTL_G(error_level) = 0;
	intl_error_set(NULL, error, "Failed to format number range");
	INTL_G(use_exceptions) = old_use_exception;
	INTL_G(error_level) = old_error_level;

However, it seems you always just throw exceptions for any failure condition. Always throwing exceptions directly would let you forgo carrying the intl_error struct around in rangeformatter_data, which would give the RangeFormatter class a smaller memory footprint.
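
The save, override, restore pattern from the snippet above can also be expressed as a scope guard. The following is only a sketch under stated assumptions: `ErrorSettings`, `g_settings`, `report_error`, and `ForceExceptionsGuard` are hypothetical stand-ins for `INTL_G(...)` and `intl_error_set`, and the sketch uses C++ exceptions, which unwind the stack, whereas a Zend exception is recorded and the C call returns normally.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the extension globals accessed through INTL_G(...).
struct ErrorSettings {
    bool use_exceptions = false;
    long error_level = 2;
};

static ErrorSettings g_settings;

// Hypothetical error reporter: throws only when exception mode is enabled.
void report_error(const std::string& message) {
    if (g_settings.use_exceptions) {
        throw std::runtime_error(message);
    }
    // Otherwise a warning or nothing at all would follow, depending on error_level.
}

// Scope guard: forces exception mode on construction and restores the
// previous settings on destruction, including during stack unwinding.
class ForceExceptionsGuard {
public:
    ForceExceptionsGuard()
        : old_use_exceptions_(g_settings.use_exceptions),
          old_error_level_(g_settings.error_level) {
        g_settings.use_exceptions = true;
        g_settings.error_level = 0;
    }
    ~ForceExceptionsGuard() {
        g_settings.use_exceptions = old_use_exceptions_;
        g_settings.error_level = old_error_level_;
    }
    ForceExceptionsGuard(const ForceExceptionsGuard&) = delete;
    ForceExceptionsGuard& operator=(const ForceExceptionsGuard&) = delete;

private:
    bool old_use_exceptions_;
    long old_error_level_;
};

// Always throws, regardless of the mode the caller had configured.
void fail_with_exception(const std::string& message) {
    ForceExceptionsGuard guard;
    report_error(message);
}
```

Whether a guard like this is worth it depends on how many call sites need the override; for a single call site the manual version above is arguably just as clear.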

Contributor Author

> If you decide to always throw, then just throw an exception directly rather than going through its internal error mechanism.

I've used intl_error_set so the behavior is consistent with the other classes. Without it, functions like intl_get_error_code() and intl_get_error_message() likely won’t work properly. It’s also used for IntlNumberRangeFormatter::getErrorCode() and IntlNumberRangeFormatter::getErrorMessage().

Another option would be to drop the exception and rely on intl_error_set alone, but then format would have to return a falsy value on failure. I would personally prefer to avoid that, because the return type would need to change to string|false.

I'm fine either way, just wanted to mention the reasoning. :)


U_CFUNC PHP_METHOD(IntlNumberRangeFormatter, createFromSkeleton)
{
#if U_ICU_VERSION_MAJOR_NUM < 63
Member

? So how will this even compile then on older versions? Shouldn't this entire class be compiled conditionally instead of adding exception code at the top?

Contributor Author

Yes, that's correct, it still compiles on older versions. I think it's more developer-friendly to tell developers why the class is unavailable with their configuration: done this way, they learn that the class is unavailable because of the ICU constraint. Otherwise they would get a fatal error for a non-existent class, which would feel confusing imo.

It was previously discussed here too. :)
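
As a rough illustration of the trade-off, the sketch below keeps the entry point compiled on every build and fails at call time when the library is too old. `SKETCH_ICU_VERSION_MAJOR` and `create_from_skeleton` are hypothetical names standing in for `U_ICU_VERSION_MAJOR_NUM` and the real method; nothing here calls ICU.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical macro standing in for U_ICU_VERSION_MAJOR_NUM.
#ifndef SKETCH_ICU_VERSION_MAJOR
#define SKETCH_ICU_VERSION_MAJOR 62
#endif

// The function is defined on every build, so callers never hit a missing symbol.
std::string create_from_skeleton(const std::string& skeleton) {
#if SKETCH_ICU_VERSION_MAJOR < 63
    // Old library: fail at call time with a message that names the constraint.
    (void)skeleton;
    throw std::runtime_error("number range formatting requires ICU 63 or newer");
#else
    // New enough library: the real formatting code would live in this branch only.
    return "formatter(" + skeleton + ")";
#endif
}
```

The alternative raised in the review, compiling the whole class conditionally, would remove the symbol entirely on old builds, so the failure would surface as an unknown class rather than as this message.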

Linked issue that merging this pull request may close: [intl] Expose the ICU NumberRangeFormatter